On-going Cooperative Research towards Developing Economy-Oriented Chinese-French SMT Systems with a New SMT Framework

Authors

  • Yidong Chen
  • Ling Xiao Wang
  • Christian Boitet
  • Xiaodong Shi
Abstract

We present an ongoing collaborative project between Grenoble University and Xiamen University (XMU) that aims to create instances of a new kind of SMT system using semantic and discourse-related resources. The concrete goal is to develop Chinese-French systems specialized for stock-option and economic websites. Since very few Chinese-French bilingual corpora and dictionaries are freely available on the Internet, English is used as a "pivot" for constructing Chinese-French translation equivalents by transitivity. For this, we use a method proposed by XMU, probability induction based on topic similarity, which produces C-F translation tables from C-E and E-F translation tables. To obtain good C-F parallel corpora, we use a web-based collaborative post-editing system that can trigger incremental improvement of the MT system by using MT evaluation metrics and extracting the "best part" of the current translation memory.

Keywords: statistical machine translation (SMT), Chinese-French, economic domain
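The pivot idea in the abstract, building C-F entries from C-E and E-F tables by transitivity, can be illustrated with plain triangulation, p(f | c) = Σ_e p(f | e) · p(e | c). The sketch below is only that baseline: it is not the topic-similarity-based probability induction attributed to XMU, whose details the abstract does not give. The function name `triangulate`, the nested-dict table layout, and the toy probabilities are all assumptions made for illustration.

```python
from collections import defaultdict


def triangulate(src_to_pivot, pivot_to_tgt):
    """Compose two translation tables through a pivot language.

    Both tables are assumed to be nested dicts of the form
    {source_word: {target_word: probability}}.
    Computes p(t | s) = sum over pivot words p of p(t | p) * p(p | s),
    then renormalizes each row so it sums to 1.
    """
    src_to_tgt = {}
    for s, pivots in src_to_pivot.items():
        scores = defaultdict(float)
        for p, prob_p_given_s in pivots.items():
            # Pivot words missing from the second table contribute nothing.
            for t, prob_t_given_p in pivot_to_tgt.get(p, {}).items():
                scores[t] += prob_p_given_s * prob_t_given_p
        total = sum(scores.values())
        if total > 0:
            src_to_tgt[s] = {t: v / total for t, v in scores.items()}
    return src_to_tgt


# Toy tables with invented probabilities (not taken from the paper).
c_e = {"股票": {"stock": 0.8, "share": 0.2}}
e_f = {
    "stock": {"action": 0.6, "stock": 0.4},
    "share": {"action": 0.7, "part": 0.3},
}

c_f = triangulate(c_e, e_f)
best = max(c_f["股票"], key=c_f["股票"].get)
print(best, round(c_f["股票"][best], 2))
```

Plain triangulation treats every pivot word as equally reliable regardless of context, which is presumably the kind of weakness a topic-aware weighting would try to address.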

Similar resources

Chinese-English Statistical Machine Translation by Parsing

Statistical machine translation (SMT) has evolved from the word-based level to higher levels of abstraction. Currently the best-known systems are phrase-based, and recent research has started to explore tree-based systems with syntactic information. This thesis aims to study large-scale Chinese-English SMT using a syntactic tree-based model. From the engineering point of view, SMT systems ar...

SMT for restricted sublanguage in CAT tool context at the European Parliament

This paper shows that it is possible to efficiently develop Statistical Machine Translation (SMT) systems that are useful for a specific type of sublanguage in a real context of use, even when the exact Translation Memory (TM) matches are excluded from the test set so that the systems can be integrated into Computer-Aided Translation (CAT) tools. It means that the included part is quite different from the existing...

A Hybrid Machine Translation System Based on a Monotone Decoder

In this paper, a hybrid Machine Translation (MT) system is proposed by combining the result of a rule-based machine translation (RBMT) system with a statistical approach. The RBMT uses a set of linguistic rules for translation, which leads to better translation results in terms of word ordering and syntactic structure. On the other hand, SMT works better in lexical choice. Therefore, in our sys...

The CMU-UKA statistical machine translation systems for IWSLT 2007

This paper describes the CMU-UKA statistical machine translation systems submitted to the IWSLT 2007 evaluation campaign. Systems were submitted for three language-pairs: Japanese→English, Chinese→English and Arabic→English. All systems were based on a common phrase-based SMT (statistical machine translation) framework but for each language-pair a specific research problem was tackled. For Japa...

Semantics, Discourse and Statistical Machine Translation

In the past decade, statistical machine translation (SMT) has advanced from word-based SMT to phrase- and syntax-based SMT. Although this advancement produces significant improvements in BLEU scores, crucial meaning errors and a lack of cross-sentence connections at the discourse level still hurt the quality of SMT-generated translations. More recently, we have witnessed two active movements in SM...

Publication date: 2014